
## :hammer_and_wrench: Installation

Create and activate the conda environment, which installs the required packages:

```
conda env create -f environment.yml
conda activate halc
```

## :bee: LVLM Backbones

The evaluations below require the MSCOCO 2014 dataset. Download it [here](https://cocodataset.org/#home) and extract it to your data path.

You also need to prepare the following checkpoints of the 7B base models:

- Download [LLaVA-1.5 merged 7B model](https://huggingface.co/liuhaotian/llava-v1.5-7b) and specify it at [Line 14](https://github.com/BillChan226/HALC/blob/924cdc09310df8826fe2f8e2e16c25a6312a48b7/eval_configs/llava-1.5_eval.yaml#L14) of `eval_configs/llava-1.5_eval.yaml`.
- Download [LLaMA-2 7B model](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf/tree/main) and specify it at [Line 15](https://github.com/BillChan226/HALC/blob/924cdc09310df8826fe2f8e2e16c25a6312a48b7/minigpt4/configs/models/minigpt4_llama2.yaml#L15) of `minigpt4/configs/models/minigpt4_llama2.yaml`.
- Download [Vicuna 7B v1.1 model](https://github.com/lm-sys/FastChat) and specify it at [Line 25](https://github.com/BillChan226/HALC/blob/924cdc09310df8826fe2f8e2e16c25a6312a48b7/minigpt4/configs/models/blip2_instruct_vicuna7b.yaml#L25) of `minigpt4/configs/models/blip2_instruct_vicuna7b.yaml`.
- Download [Vicuna 7B v0 model](https://huggingface.co/Vision-CAIR/vicuna-7b/tree/main) and specify it at [Line 18](https://github.com/BillChan226/HALC/blob/924cdc09310df8826fe2f8e2e16c25a6312a48b7/minigpt4/configs/models/minigpt4_vicuna0.yaml#L18) of `minigpt4/configs/models/minigpt4_vicuna0.yaml`.
- Download [MiniGPT-4 7B pretrained weights](https://drive.google.com/file/d/1RY9jV0dyqLX-o38LrumkKRh6Jtaop58R/view?usp=sharing) and specify it at [Line 8](https://github.com/BillChan226/HALC/blob/924cdc09310df8826fe2f8e2e16c25a6312a48b7/eval_configs/minigpt4_eval.yaml#L8C10-L8C10) of `eval_configs/minigpt4_eval.yaml`.
- Download [MiniGPT-4 7B pretrained weights for LLaMA-2](https://drive.google.com/file/d/1RY9jV0dyqLX-o38LrumkKRh6Jtaop58R/view?usp=sharing) and specify it at [Line 8](https://github.com/BillChan226/HALC/blob/924cdc09310df8826fe2f8e2e16c25a6312a48b7/eval_configs/minigpt4_llama2_eval.yaml#L8) of `eval_configs/minigpt4_llama2_eval.yaml`.
- Download [mPLUG-Owl2 7B pretrained weights](https://huggingface.co/MAGAer13/mplug-owl2-llama2-7b) and specify it at [Line 14](https://github.com/BillChan226/HALC/blob/924cdc09310df8826fe2f8e2e16c25a6312a48b7/eval_configs/mplug-owl2_eval.yaml#L14) of `eval_configs/mplug-owl2_eval.yaml`.

### Arguments

| Argument             | Example             | Description   |
| -------------------- | ------------------- | ------------- |
| `--model`    | `llava-1.5` | The MLLM backbone to use. This codebase supports `instructblip`, `minigpt4`, `llava-1.5`. |
| `--data-path`     | `/path/to/dataset` | Path to the dataset file or folder, e.g., `COCO_2014/val2014/`. |
| `--pope-type`     | `random` | Type of POPE evaluation. Supports `random`, `popular`, `adversarial`. |
| `--beam`   | `3` | Beam size for global search. Default: 1. |

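
As a rough illustration of how the arguments in the table fit together, the sketch below declares them with `argparse`. It is not the repository's own parser: which arguments are required, and the default for `--pope-type`, are assumptions made for the example. Only the `--beam` default of 1 comes from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the argument table (not the repo's code)."""
    parser = argparse.ArgumentParser(description="Illustrative argument parser")
    # Backbones listed in the table; making this required is an assumption.
    parser.add_argument(
        "--model",
        choices=["instructblip", "minigpt4", "llava-1.5"],
        required=True,
        help="MLLM backbone to evaluate",
    )
    # Path to the dataset file or folder, e.g. COCO_2014/val2014/.
    parser.add_argument("--data-path", type=str, required=True)
    # POPE split; the default value here is an assumption.
    parser.add_argument(
        "--pope-type",
        choices=["random", "popular", "adversarial"],
        default="random",
    )
    # Beam size for global search; the table gives 1 as the default.
    parser.add_argument("--beam", type=int, default=1)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--model", "llava-1.5", "--data-path", "COCO_2014/val2014/", "--beam", "3"]
    )
    # argparse turns dashes into underscores: --data-path becomes args.data_path
    print(args.model, args.data_path, args.pope_type, args.beam)
```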

## :hourglass: Benchmarking OH

#### :chair: Running CHAIR evaluation for LVLM object hallucination

Following [Evaluating Object Hallucination in Large Vision-Language Models](https://arxiv.org/pdf/2305.10355.pdf), we used "Please describe this image in detail." as the prompt to query the LVLM for captions of `500` images randomly sampled from the [COCO 2014 Val](https://cocodataset.org/#download) dataset. From the repository root, run

```
python run_scripts/caption_generation.py --model [LVLM Backbone] --data_path [COCO_DIR] -d [Decoding Strategy] --num_samples 500 --seed [SEED] --gpu-id [GPU_IDs] --output_dir ./generated_captions/ --debugging 1
```

`--debugging 1` will print the intermediate hallucination correction process of HALC.
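
The `--num_samples` and `--seed` options together determine which images get captioned. The sketch below shows one way a reproducible random subset could be drawn. It is an illustration only: the function name `sample_images` is hypothetical, it assumes the images are `.jpg` files directly under the data folder, and `caption_generation.py` may select images differently.

```python
from __future__ import annotations

import random
from pathlib import Path


def sample_images(coco_dir: str, num_samples: int = 500, seed: int = 0) -> list[str]:
    """Return a reproducible random subset of image file names (hypothetical helper)."""
    # Sort first so the result does not depend on filesystem listing order.
    names = sorted(p.name for p in Path(coco_dir).glob("*.jpg"))
    if num_samples > len(names):
        raise ValueError(f"requested {num_samples} images but found {len(names)}")
    # A dedicated Random instance keeps the draw independent of global RNG state.
    return random.Random(seed).sample(names, num_samples)
```

Calling the function twice with the same seed returns the same file names, which is what makes runs with different decoding strategies comparable.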

### Evaluation

#### CHAIR Scores

After preparing your caption files with the commands above, you can evaluate the captions in either **one-shot mode** (a single caption file) or **batch mode** (all the caption files in a folder). To evaluate a single caption file, run

```
python eval/eval_hallucination.py --metric chair --chair_input_path [PATH_TO_CAPTION_DIR] -v
```
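
For orientation, CHAIR is usually reported as two ratios: the instance-level score (hallucinated object mentions over all object mentions) and the sentence-level score (captions with at least one hallucinated object over all captions). The sketch below is a simplified illustration of those two ratios, not the logic of `eval/eval_hallucination.py`. It assumes the objects mentioned in each caption have already been extracted as sets, and the function name `chair_scores` is hypothetical.

```python
def chair_scores(mentioned, ground_truth):
    """Compute (CHAIR_s, CHAIR_i) from per-caption object sets.

    mentioned:    list of sets, objects named in each generated caption
    ground_truth: list of sets, objects actually present in each image
    """
    total_mentions = 0
    hallucinated_mentions = 0
    hallucinated_captions = 0
    for objs, gt in zip(mentioned, ground_truth):
        hallucinated = objs - gt  # mentioned but not present in the image
        total_mentions += len(objs)
        hallucinated_mentions += len(hallucinated)
        hallucinated_captions += bool(hallucinated)
    # Guard against empty inputs to avoid division by zero.
    chair_i = hallucinated_mentions / total_mentions if total_mentions else 0.0
    chair_s = hallucinated_captions / len(mentioned) if mentioned else 0.0
    return chair_s, chair_i
```

Lower is better for both scores.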

To evaluate a batch of caption files, run

```
python eval/caption_to_chair.py -c [PATH_TO_CAPTION_FOLDER_DIR]
```

This first converts the caption files into the format required for CHAIR evaluation and writes a `_chair.json` file to the same folder. To then compute the CHAIR scores as well as the generation quality scores, run
```shell
python eval/batch_eval.py -c [PATH_TO_CAPTION_FOLDER_DIR] --evaluator chair --coco_path [COCO_DIR]
```

Note that `[COCO_DIR]` is expected to contain both the images and the annotation files, with the latter inside the `annotations` subfolder. In other words, `[COCO_DIR]` should have the following structure:

```
COCO_DIR (val2014 for example)
  - annotations
    - captions_train2014.json
    - captions_val2014.json
    - instances_train2014.json
    - instances_val2014.json
    - person_keypoints_train2014.json
    - person_keypoints_val2014.json
  - COCO_val2014_000000000042.jpg
  - COCO_val2014_000000000073.jpg
  ...
```
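
Before launching a long evaluation, it can help to confirm that the folder matches this layout. The sketch below is one possible check, not part of the repository. The helper name `check_coco_dir` is hypothetical, and which annotation files the evaluator actually reads is an assumption: the example looks only for the two `val2014` files and for at least one image.

```python
from pathlib import Path

# Assumption: these are the annotation files needed for val2014 evaluation.
EXPECTED_ANNOTATIONS = ("captions_val2014.json", "instances_val2014.json")


def check_coco_dir(coco_dir):
    """Return a list of layout problems; an empty list means the check passed."""
    root = Path(coco_dir)
    problems = []
    annotations = root / "annotations"
    if not annotations.is_dir():
        problems.append("missing annotations/ subfolder")
    else:
        for name in EXPECTED_ANNOTATIONS:
            if not (annotations / name).is_file():
                problems.append(f"missing annotations/{name}")
    # Images are expected directly under COCO_DIR, next to annotations/.
    if not any(root.glob("COCO_val2014_*.jpg")):
        problems.append("no COCO_val2014_*.jpg images found")
    return problems
```

For example, `check_coco_dir("/path/to/val2014")` returns an empty list when the folder is laid out as shown above.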